Likelihood-ratio ranking of gravitational-wave candidates in a non-Gaussian background.
We describe a general approach to detection of transient gravitational-wave signals in the presence of non-
Gaussian background noise. We prove that under quite general conditions, the ratio of the likelihood of observed
data to contain a signal to the likelihood of it being a noise fluctuation provides optimal ranking for the candi- 
date events found in an experiment. The likelihood-ratio ranking allows us to combine different kinds of data 
into a single analysis. We apply the general framework to the problem of unifying the results of independent 
experiments and the problem of accounting for non-Gaussian artifacts in the searches for gravitational waves 
from compact binary coalescence in LIGO data. We show analytically and confirm through simulations that in 
both cases the likelihood ratio statistic results in an improved analysis. 

PACS numbers: 04.80.Nn, 07.05.Kf, 95.55.Ym



I. INTRODUCTION 

The detection of gravitational waves from astrophysical
sources is a long-standing problem in physics. Over the past
decade, the experimental emphasis has been on the construction
and operation of kilometer-scale interferometric detectors
such as the Laser Interferometer Gravitational-wave
Observatory (LIGO) [1]. The instruments measure the strain, s(t),
by monitoring light at the interferometer's output port, which
varies as test masses that are suspended in vacuum at the ends
of orthogonal arms differentially approach and recede by
minuscule amounts. The strain signal, s(t), is a combination of
noise, n(t), and gravitational-wave signal, h(t).

There is a well-established literature describing the analysis
of time-series data for signals of various types [2]; these
methods have been extended to address gravitational-wave
detection [3]. This approach usually begins with the assumption
that the detector noise, n(t), is stationary and Gaussian.
One then derives a set of filters that are tuned to
detect the particular signals in this time-series data. The
result is both elegant and powerful: whitened detector noise
is correlated with a whitened version of the expected signal.
The approach has been used to develop techniques to search
for gravitational waves from compact binary coalescence,
isolated neutron stars, stochastic sources, and generic bursts with
certain time-frequency characteristics [3].

This approach takes the important first step of design- 
ing filters that properly suppress the dominant, frequency- 
dependent noise sources in the instrument. The simplicity 
of the filters is due to the fact that the power-spectral den- 
sity fully characterizes the statistical properties of stationary, 
Gaussian noise. However, interferometric detectors are prone
to non-Gaussian and non-stationary noise sources. Environmental
disturbances, including seismic, acoustic, and electromagnetic
effects, can lead to artifacts in the time series that
are neither gravitational waves nor stationary, Gaussian noise. 
Imperfections in hardware can lead to unwanted signals in the 
time series that originate from auxiliary control systems. 

To help identify and remove these unwanted signals, instru- 
ments have been constructed at geographically separated sites 
and the data are analyzed together. A plethora of diagnos- 
tics have also been developed to characterize the quality of 
the data [4–9]. Searches for gravitational waves use more
than just the filtered output of the time-series, s(t), to sepa- 
rate gravitational-wave signals from noise. Moreover, the re- 
sponses from various filters indicate that the underlying noise 
sources are not Gaussian, even after substantial data quality 
filtering and coincidence requirements have been applied. 

In this paper, we discuss using likelihood-ratio ranking as 
a unified approach to gravitational-wave data analysis. The 
approach forgoes the stationary, Gaussian model of the detector
noise. The output of the filters derived under that assumption
becomes one element in a list of parameters that
characterize a gravitational-wave detection candidate. The de- 
tection problem is then couched in terms of the statistical prop- 
erties of an n-tuple of derived quantities, leading directly to a 
likelihood-ratio ranking for detection candidates. The n-tuple 
can include more information than simply the signal-to-noise 
ratio (SNR) measured in each instrument of the network. It 
can include measures of data quality, the physical parameters
of the gravitational-wave candidate, and the SNR from the coherent
and null combinations of the detector signals; it can include
nearly any measure of detector behavior or signal quality.
This approach was already used to develop ranking statistics
for compact binary coalescence signals [8–10] and is at
the core of a powerful coincidence test developed for burst
searches [11].

This work presents a general framework for the likelihood- 
ratio ranking in the context of gravitational-wave detection. 
We explore its analytical properties and illustrate its practical 
value by applying it to two data analysis problems arising in 
real-life searches for gravitational waves in LIGO data. 



II. GENERAL DERIVATION OF LIKELIHOOD-RATIO 
RANKING 

Let the n-tuple c denote the observable data in some experi- 
ment that aims to detect a signal denoted by h. This signal can 
usually be parametrized by several continuous parameters that 
may be unknown, for example distance to the source of grav- 
itational waves and location on the sky. The purpose of the 
experiment is to identify the signal. Depending on whether 
a Bayesian or frequentist statistical approach is taken, this 
is stated in terms of either the probability that the signal is 
present or the probability that the observed data are a noise 
fluctuation. 

In this section, we show that both approaches lead to rank- 
ing candidate signals according to the likelihood ratio 



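$$ \Lambda(\mathbf{c}) = \frac{\int p(\mathbf{c} \mid \mathbf{h}, 1)\, p(\mathbf{h} \mid 1)\, d\mathbf{h}}{p(\mathbf{c} \mid 0)} \tag{1} $$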



where p(c | h, 1) is the probability of observing c in the presence
of the signal h, p(h | 1) is the prior probability to receive
that signal, and p(c | 0) is the probability of observing c in the
absence of any signal. The higher a candidate's Λ value, the
more likely it is a real signal.



A. Bayesian Analysis 

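In this approach, we compute the probability that a signal
is present in the observed data, p(1 | c). By a straightforward
application of Bayes' theorem, we write

$$ p(1 \mid \mathbf{c}) = \frac{p(\mathbf{c} \mid 1)\, p(1)}{p(\mathbf{c} \mid 1)\, p(1) + p(\mathbf{c} \mid 0)\, p(0)} = \frac{\int p(\mathbf{c} \mid \mathbf{h}, 1)\, p(\mathbf{h} \mid 1)\, p(1)\, d\mathbf{h}}{\int p(\mathbf{c} \mid \mathbf{h}, 1)\, p(\mathbf{h} \mid 1)\, p(1)\, d\mathbf{h} + p(\mathbf{c} \mid 0)\, p(0)} , \tag{2} $$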



where p(0) is the a priori probability that the signal is absent
and p(1) is the a priori probability that there is a signal (of
any kind). The denominator re-expresses p(c) in terms of the
two mutually exclusive outcomes: the signal is present or
the signal is absent. Upon successive division of numerator
and denominator by p(c | 0) and p(1), we find



$$ p(1 \mid \mathbf{c}) = \frac{\Lambda(\mathbf{c})}{\Lambda(\mathbf{c}) + p(0)/p(1)} , \tag{3} $$



which is a monotonically increasing function of the likelihood
ratio Λ defined by Eq. (1). Hence, the larger the likelihood
ratio, the more probable it is that a signal is present.
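
The mapping of Eq. (3) from likelihood ratio to posterior probability can be sketched in a few lines. This is only an illustration: the function name `posterior_signal_probability` and the example prior of 0.01 are assumptions made here, not quantities taken from the analysis.

```python
import math


def posterior_signal_probability(likelihood_ratio, prior_signal):
    """Eq. (3): p(1|c) = Lambda / (Lambda + p(0)/p(1)).

    prior_signal plays the role of p(1); p(0) is taken as 1 - p(1).
    """
    if not 0.0 < prior_signal < 1.0:
        raise ValueError("prior_signal must lie strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("a likelihood ratio cannot be negative")
    if math.isinf(likelihood_ratio):
        return 1.0  # noise cannot produce this datum at all
    prior_odds_against = (1.0 - prior_signal) / prior_signal  # p(0)/p(1)
    return likelihood_ratio / (likelihood_ratio + prior_odds_against)


# Hypothetical likelihood ratios for four candidates.
candidates = [0.2, 5.0, 1.0, 300.0]
posteriors = [posterior_signal_probability(lam, prior_signal=0.01)
              for lam in candidates]
```

Because the map is monotonic, sorting candidates by `posteriors` gives the same order as sorting by `candidates`, whatever prior is chosen; the prior only rescales the values.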



B. Frequentist Approach 

The process of detection can always be reduced to a bi- 
nary "yes" or "no" question — does the observed data contain 
the signal? An optimal detection scheme should achieve the 
maximum rate of successful detections — correctly given "yes" 
answers — with some fixed, preferably low, rate of false alarms 
or false positives — incorrectly given "yes" answers. This is 
the essence of the Neyman-Pearson optimality criterion for detection,
which states that an optimal detector should maximize
the probability of detection at a fixed probability of false
alarm [12].

As before, let the n-tuple c denote the observable data and 
h the signal that is the object of the search. Without loss of 
generality, any decision-making algorithm can be mapped into 
a real function, f(c), of the data that signifies detection whenever
its value is greater than or equal to a threshold value, F*.
Thus, using the Neyman-Pearson formalism, an optimal detector
is realized by finding a function, f(c), that maximizes the
probability of detection at a fixed value of the probability of
false alarm. The probability of detection, P_1, is



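$$ P_1 = \int_{V_d} \int_{V_h} \Theta\big(f(\mathbf{c}) - F^*\big)\, p(\mathbf{c} \mid \mathbf{h}, 1)\, p(\mathbf{h} \mid 1)\, p(1)\, d\mathbf{h}\, d\mathbf{c} , \tag{4} $$

and the probability of false alarm (footnote 2), P_0, is

$$ P_0 = \int_{V_d} \Theta\big(f(\mathbf{c}) - F^*\big)\, p(\mathbf{c} \mid 0)\, p(0)\, d\mathbf{c} , \tag{5} $$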



where V_h identifies the subset of signals targeted by the search,
V_d denotes the subset of accessible data, and integration is
performed over all signals, h, and data points, c, within these
subsets. Treating P_1 and P_0 as functionals of f(c), we find
that for an optimal detector, the variation of



$$ S[f(\mathbf{c})] = P_1[f(\mathbf{c})] - l_0 \big( P_0[f(\mathbf{c})] - P^* \big) \tag{6} $$



should vanish. Here l_0 denotes a Lagrange multiplier and P* is
a constant that sets the value of the probability of false alarm.
The variation of Eq. (6) with respect to f(c) gives



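$$ \delta S = \int_{V_d} \delta\big(f(\mathbf{c}) - F^*\big)\, \delta f(\mathbf{c}) \left[ \int_{V_h} p(\mathbf{c} \mid \mathbf{h}, 1)\, p(\mathbf{h} \mid 1)\, p(1)\, d\mathbf{h} - l_0\, p(\mathbf{c} \mid 0)\, p(0) \right] d\mathbf{c} . \tag{7} $$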



1 This ratio of likelihoods is also known as the Bayes factor. 

2 This is similar, but not exactly equal, to the false-alarm probability or Type
I error, which assumes the case where no signal is present, that is, does not
include the term p(0).

Variations δf(c) at different data points are independent, thus
implying that after integration over c, the condition



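$$ \frac{\int p(\mathbf{c}^* \mid \mathbf{h}, 1)\, p(\mathbf{h} \mid 1)\, d\mathbf{h}}{p(\mathbf{c}^* \mid 0)} = l_0\, \frac{p(0)}{p(1)} = \mathrm{const} \tag{8} $$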



must be satisfied at all points c* for which the argument of the 
delta function satisfies the condition 



$$ f(\mathbf{c}^*) - F^* = 0 . \tag{9} $$



This latter condition defines the detection surface separating
the detection and non-detection regions. Note that as f(c)
varies, the shape of this surface changes accordingly.
Therefore, Eq. (8) implies that the detection surface must be a
surface of constant likelihood ratio for an optimal detector.
This is the only condition on the functional form of f(c).
Variation with respect to F* does not give a new condition,
whereas variation with respect to the Lagrange multiplier, l_0,
simply sets the probability of a false alarm to be P* (footnote 3).

A natural way to satisfy the optimality criteria is to use the 
likelihood ratio 



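$$ \Lambda(\mathbf{c}) = \frac{\int p(\mathbf{c} \mid \mathbf{h}, 1)\, p(\mathbf{h} \mid 1)\, d\mathbf{h}}{p(\mathbf{c} \mid 0)} \tag{10} $$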



or any monotonically increasing function f(Λ(c)) for ranking the
candidate signals. With this choice, the optimality condition
Eq. (8) is satisfied for any threshold F*. The latter is
determined by the choice of an admissible value of the
probability of false alarm, P*, through

$$ P_0\big[f(\Lambda(\mathbf{c}))\big] = P^* . \tag{11} $$
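
In practice the threshold in Eq. (11) is often set empirically from a sample of background candidates. The sketch below assumes such a sample is available as a plain list of ranking-statistic values; it estimates the false-alarm probability conditional on no signal being present, i.e. without the p(0) factor discussed in footnote 2. The function names are hypothetical, and `math.nextafter` requires Python 3.9 or later.

```python
import math


def threshold_for_false_alarm(background_stats, target_fap):
    """Return F* such that the fraction of background samples with
    statistic >= F* does not exceed target_fap."""
    if not background_stats:
        raise ValueError("need at least one background sample")
    if not 0.0 <= target_fap <= 1.0:
        raise ValueError("target_fap must lie in [0, 1]")
    ranked = sorted(background_stats, reverse=True)
    # Number of background samples that are allowed to pass the threshold.
    allowed = int(math.floor(target_fap * len(ranked) + 1e-12))
    if allowed >= len(ranked):
        return -math.inf  # every sample may pass
    # Place the threshold just above the loudest sample that must be rejected.
    return math.nextafter(ranked[allowed], math.inf)


def empirical_false_alarm(background_stats, threshold):
    """Fraction of background samples at or above the threshold."""
    passed = sum(1 for s in background_stats if s >= threshold)
    return passed / len(background_stats)
```

Ties at the boundary are resolved conservatively: when several background samples share the boundary value, all of them are rejected, so the achieved false-alarm fraction can be lower than the target but never higher.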



C. Variation of efficiency with volume of search space 

The likelihood ratio given in Eq. (10) is guaranteed to maximize
the probability of signal detection for a given search.
Because optimization is performed for a fixed region defined 
by all possible values of a candidate's parameters, c, in the 
search, it is unclear whether increasing the volume of avail- 
able data (e.g. extension of the bank of template waveforms) 
would not result in an overall decrease of probability to detect 
signals. For example, one may be apprehensive of the poten- 
tial increase in the rate of false alarms solely due to extension 
of the parameter space. Intuitively, having more available in- 
formation should not negatively affect the detection probabil- 
ity or efficiency if the information is processed correctly. In 
what follows, we prove that this is true if the likelihood ratio 
is used for making the detection decision. 



3 In the case of mixed data, when c includes continuous as well as
discrete parameters, integration in the expressions for P_1 and P_0 should be
replaced by summation wherever it is appropriate. This does not affect the
derivation or the main result. The notion of optimal detection surface
defined by Eq. (8) is straightforward to generalize to include both continuous
and discrete data.



To prove that the detection efficiency does not decrease
when the volume of data is increased, we must show that the
variation δP_1/δV_d at a fixed P_0 is non-negative. Consider a
foliation of the space of data, V_d, by surfaces of constant
likelihood ratio, S_Λ. Functionals for the probabilities of detection,
Eq. (4), and false alarm, Eq. (5), can be written as



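$$ P_1 = \int_0^\infty d\Lambda \int_{S_\Lambda} \Theta(\Lambda - \Lambda^*)\, p(\mathbf{c} \mid 1)\, p(1) , \tag{12} $$

and

$$ P_0 = \int_0^\infty d\Lambda \int_{S_\Lambda} \Theta(\Lambda - \Lambda^*)\, p(\mathbf{c} \mid 0)\, p(0) , \tag{13} $$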



where, for brevity, we absorbed explicit integration in the
space of signals, V_h, in the product p(c | 1)p(1). P_1 is a
functional of V_d and Λ*. Since the latter is determined by the
value chosen for the false alarm probability, P_0 = P*, and the
probability of false alarm also depends on V_d, variations of
V_d and Λ* are not independent. To find the relation, we vary
the probability of false alarm



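$$ \delta P_0 = -\int_0^\infty d\Lambda \int_{S_\Lambda} \delta(\Lambda - \Lambda^*)\, p(\mathbf{c} \mid 0)\, p(0)\, \delta\Lambda^* + \int_0^\infty d\Lambda \int_{\delta S_\Lambda} \Theta(\Lambda - \Lambda^*)\, p(\mathbf{c} \mid 0)\, p(0)\, \delta S_\Lambda . \tag{14} $$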



We consider non-negative variations of surfaces of constant
likelihood ratio, δS_Λ, that correspond only to the addition of
new data points to V_d, and therefore correspond only to an
extension of surfaces, S_Λ, without an overall translation or
change of shape.

The probability of false alarm should stay constant, there- 
fore its variation should vanish, providing the relation 

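$$ \delta\Lambda^* = \frac{\int_0^\infty d\Lambda \int_{\delta S_\Lambda} \Theta(\Lambda - \Lambda^*)\, p(\mathbf{c} \mid 0)\, \delta S_\Lambda}{\int_{S_{\Lambda^*}} p(\mathbf{c} \mid 0)} . \tag{15} $$

Next, we vary the functional for the detection probability

$$ \delta P_1 = -\Lambda^* \int_{S_{\Lambda^*}} p(\mathbf{c} \mid 0)\, p(1)\, \delta\Lambda^* + \int_0^\infty d\Lambda \int_{\delta S_\Lambda} \Theta(\Lambda - \Lambda^*)\, \Lambda\, p(\mathbf{c} \mid 0)\, p(1)\, \delta S_\Lambda , \tag{16} $$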



where we use p(c | 1) = Λ(c) p(c | 0), which follows from the
definition of the likelihood ratio. Eliminating δΛ* by means
of Eq. (15) and rearranging terms, we get

$$ \delta P_1 = p(1) \int_0^\infty d\Lambda \int_{\delta S_\Lambda} \Theta(\Lambda - \Lambda^*)\, (\Lambda - \Lambda^*)\, p(\mathbf{c} \mid 0)\, \delta S_\Lambda , \tag{17} $$

which is non-negative for all positive δS_Λ by virtue of
Θ(Λ − Λ*)(Λ − Λ*) ≥ 0. This proves that if the likelihood-ratio
statistic is used in the detection process, the probability
of detection can never decrease during an extension of the
volume of available data.
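
The elimination step leading to the final expression for the variation of P_1 can be written out explicitly. The following is a sketch in the notation of this subsection; the shorthands N_* and I_k are introduced here for compactness and do not appear in the original derivation.

```latex
\begin{align}
N_* &\equiv \int_{S_{\Lambda^*}} p(\mathbf{c}\mid 0), \qquad
I_k \equiv \int_0^\infty d\Lambda \int_{\delta S_\Lambda}
  \Theta(\Lambda-\Lambda^*)\,\Lambda^k\, p(\mathbf{c}\mid 0)\,\delta S_\Lambda
  \quad (k = 0, 1), \\
\delta P_0 &= p(0)\left[-N_*\,\delta\Lambda^* + I_0\right] = 0
  \;\Rightarrow\; \delta\Lambda^* = \frac{I_0}{N_*}, \\
\delta P_1 &= p(1)\left[-\Lambda^* N_*\,\delta\Lambda^* + I_1\right]
            = p(1)\left[I_1 - \Lambda^* I_0\right] \\
           &= p(1)\int_0^\infty d\Lambda \int_{\delta S_\Lambda}
  \Theta(\Lambda-\Lambda^*)\,(\Lambda-\Lambda^*)\,
  p(\mathbf{c}\mid 0)\,\delta S_\Lambda \;\ge\; 0 .
\end{align}
```

The threshold must rise (δΛ* > 0) to absorb the extra false alarms, and the gain from the newly added region always outweighs the loss from the raised threshold because only points with Λ above Λ* contribute.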
III. APPLICATIONS



In Section III A we apply the formalism of Section II to the
problem of assessing the significance of triggers from experiments
covering disjoint times. In Section III B we demonstrate how the
likelihood-ratio ranking can improve analysis efficiency by
accounting for non-Gaussian features in the distributions of
parameters of the candidate events.



A. Combining disjoint experiments 

One complexity that arises in real-world applications is the 
necessity to combine results from multiple independent exper- 
iments. For example, gravitational-wave searches are often 
thought of in terms of times when a fixed number of interfer- 
ometers are operating. If a network consists of instruments 
that are not identical and located at different places, each com- 
bination of operating interferometers may have very differ- 
ent combined sensitivity and background noise. Times when 
three interferometers are recording data may be treated differ- 
ently from those when any pair is operating. Ideally, these 
experiments would be treated together accounting for differ- 
ences in detectors' sensitivities and background noise in the 
ranking of the candidate signals, but it is often not practical 
(see how this problem was addressed in [10]). In this section,
we show that the likelihood-ratio ranking offers a natural
solution to this problem, which is conceptually similar to a
simplified approach taken in [10].

Consider a situation in which the data are written as c =
(d, j), where j = 0, 1, 2, . . . indicates that the data arose from
an experiment covering some time interval T_j. Note that T_i ∩
T_j = ∅ if j ≠ i. The probability that a signal is present given
the data is

$$ p(1 \mid \mathbf{d}, j) = \frac{\int p(\mathbf{d}, j \mid \mathbf{h}, 1)\, p(\mathbf{h} \mid 1)\, p(1)\, d\mathbf{h}}{\int p(\mathbf{d}, j \mid \mathbf{h}, 1)\, p(\mathbf{h} \mid 1)\, p(1)\, d\mathbf{h} + p(\mathbf{d}, j \mid 0)\, p(0)} . \tag{18} $$

The conditional probabilities for the observed data can be fur- 
ther expanded as 

$$ p(\mathbf{d}, j \mid \mathbf{h}, 1) = p(\mathbf{d} \mid j, \mathbf{h}, 1)\, p(j \mid \mathbf{h}, 1) \tag{19} $$
$$ p(\mathbf{d}, j \mid 0) = p(\mathbf{d} \mid j, 0)\, p(j \mid 0) , \tag{20} $$

where we introduce p(j | h, 1) and p(j | 0), the probabilities
for the observed data to be from the j th experiment given the
presence or absence of a signal, respectively. It is reasonable to
assume (footnote 4) that p(j | h, 1) = p(j | 0). In this case, both
probabilities drop out of Eq. (18), and the expression for the probability



4 This is not strict equality. Gravitational-wave events can alter the amount of 
live time in experiments to detect them. For example, an alert sounds in the 
LIGO and Virgo control rooms when gamma-ray bursts are detected, which 
sometimes accompany CBCs. The alert prompts operators to avoid routine 
maintenance and hardware injections, with their associated deadtimes, for 
the following forty minutes. 



for a signal to be present in the data can be written as 



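$$ p(1 \mid \mathbf{d}, j) = \frac{\Lambda(\mathbf{d}, j)}{\Lambda(\mathbf{d}, j) + p(0)/p(1)} , \tag{21} $$

with the likelihood ratio Λ(d, j) given by

$$ \Lambda(\mathbf{d}, j) = \frac{\int p(\mathbf{d} \mid j, \mathbf{h}, 1)\, p(\mathbf{h} \mid 1)\, d\mathbf{h}}{p(\mathbf{d} \mid j, 0)} . \tag{22} $$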



Comparing Eq. (21) with Eq. (3), we conclude that the likelihood
ratio Λ(d, j), evaluated independently for each experiment,
provides optimal unified ranking. In terms of their likelihood
ratios, data samples from different experiments can be
compared directly, with differences in experiments' sensitivities
and noise levels being accounted for by p(d | j, h, 1) and
p(d | j, 0).

Following the steps outlined in Section II B, the same result
can also be attained by direct optimization of the combined
probability of detection at a fixed probability of false alarm.
Optimality guarantees that the results of the less sensitive
experiment can be combined with the results of the more
sensitive experiment without loss of efficiency. In this approach, a
unified scale provided by the likelihood ratio, Λ(d, j), is
explicit because, by construction, the same threshold, F*, is
applied to all data samples. F* is determined by the value of the
probability of false alarm for the combined experiment, given
by

$$ P_0 = \sum_j \int \Theta\big(\Lambda(\mathbf{d}, j) - F^*\big)\, p(\mathbf{d} \mid j, 0)\, p(j \mid 0)\, p(0)\, d\mathbf{d} , \tag{23} $$

which makes the whole process less trivial. Notice that p(j | 0)
(often approximated by T_j / Σ_i T_i) appears in the expression
for P_0 but does not appear in the expression for the
likelihood ratio given in Eq. (22). Since p(j | 0) is proportional
to the experiment duration, T_j, each experiment is
weighted appropriately in the total probability of false alarm.
In a similar fashion, experiment durations appear in the expression
for the combined efficiency or the probability of detection
for the combined experiment.
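
A minimal numerical sketch of this weighting is given below. It rests on strong simplifying assumptions that are not part of the analysis above: each experiment j has one-dimensional Gaussian noise of width sigma_j, the signal adds a fixed, known offset mu > 0 (so no marginalization over h is needed), and p(j | 0) is approximated by T_j / Σ_i T_i. The function names are hypothetical. The result is the false-alarm probability conditional on no signal, i.e. without the overall p(0) factor.

```python
import math


def likelihood_ratio(d, sigma, mu=1.0):
    """Lambda(d, j) for Gaussian noise of width sigma and signal offset mu:
    N(d; mu, sigma) / N(d; 0, sigma)."""
    return math.exp((d * mu - 0.5 * mu * mu) / sigma ** 2)


def false_alarm_given_noise(threshold, sigma, mu=1.0):
    """P(Lambda >= threshold | experiment with width sigma, no signal).

    Requires threshold > 0 and mu > 0.
    """
    # Lambda >= threshold is equivalent to d >= d_star for mu > 0.
    d_star = sigma ** 2 * math.log(threshold) / mu + 0.5 * mu
    return 0.5 * math.erfc(d_star / (sigma * math.sqrt(2.0)))


def combined_false_alarm(threshold, sigmas, durations, mu=1.0):
    """Duration-weighted sum over experiments with a common threshold."""
    total = sum(durations)
    return sum((t / total) * false_alarm_given_noise(threshold, s, mu)
               for s, t in zip(sigmas, durations))
```

In this toy setting the same datum d receives a lower likelihood ratio in the noisier experiment, which is how the common threshold accounts for differing noise levels, while the durations enter only through the combined false-alarm probability.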



B. Combining search spaces 

Sophisticated searches for gravitational-wave signals from
compact binary coalescence [10, 13–15] have been developed
over the past decade. The non-Gaussian and non-stationary
noise is substantially suppressed by the application of
instrumental and environmental vetoes [4–9], coincidence between
detectors, and numerous other checks on the quality of putative
gravitational-wave signals. Nevertheless, the number
of background triggers as a function of SNR depends on the
masses of the binaries targeted in a search. For this reason,
triggers have been divided into categories based on the
chirp mass, M_c, of the filters that produced the trigger (where
M_c = ((m_1 m_2)^3 / (m_1 + m_2))^(1/5) and m_1 and m_2 are the
masses of the compact objects in the binary). The background
is a slowly varying function of M_c, falling off more rapidly, as
a function of SNR, for smaller values of M_c. This is a
manifestation of non-Gaussianities still present in the data. It is
desirable to account for this dependence when ranking
candidates found in the search.

In this section, we consider a toy problem that mimics the
properties of the compact binary search and demonstrates how
the likelihood-ratio ranking matches our intuition. Following
that example, we present the results of a simulated compact
binary search and demonstrate that the detection statistic based
on the likelihood ratio accounts for non-Gaussian features in the
background distribution and improves search efficiency.



1. Toy Problem

Consider an experiment in which the data that define a candidate
are c = (ρ, x), where ρ is the SNR and x is an extra
parameter describing the data sample (e.g. the chirp mass of
the binary). Suppose the distribution of the data in the absence
of a signal is

$$ p(\rho, x \mid 0) = A \rho \exp(-\rho^2)\, \Theta(x)\, \Theta(1 - x) + B\, \Theta(x + 1)\, \Theta(-x)\, \Theta(\rho)\, \Theta(a - \rho) . \tag{24} $$

Figure 1 provides a graphic representation of this distribution.
Notice that p(ρ, x | 0) = 0 for x < 0 and ρ > a; therefore
data (ρ, x) in this region of the plane indicate the presence of
a signal with unit probability. This intuition is clearly borne
out in the above analysis since

$$ p(1 \mid \rho, x) = \frac{p(\rho, x \mid 1)\, p(1)}{p(\rho, x \mid 1)\, p(1) + 0} = 1 \tag{25} $$

for {(ρ, x) | x < 0 and ρ > a}; compare this equation with
Eq. (2). The likelihood ratio for these data points is infinite,
reflecting complete certainty that the data samples from this
region are signals.

FIG. 1. Graphic representation of the model background distribution
of Eq. (24) for a = 2.0. Shaded areas define the regions of non-zero
probability.



2. Simulated compact binary search

For the purpose of simulating a real-life search we use data
from LIGO's fourth science run, February 24-March 24, 2005.
The data were collected by three detectors: the H1 and H2
co-located detectors in Hanford, WA, and the L1 detector in
Livingston, LA.

The search targets three types of binaries: neutron
star-neutron star (BNS), neutron star-black hole (NSBH)
and black hole-black hole (BBH). To model signals from
these systems, we use non-spinning, post-Newtonian waveforms
[16–26] that are Newtonian order in amplitude and second
order in phase, calculated using the stationary phase
approximation [17, 24, 25] with the upper cut-off frequency set
by the Schwarzschild innermost stable circular orbit. We
generate three sets of simulated signals, one for each type of
binary. The neutron star masses are chosen randomly in the
range 1-3 M_⊙, while the black hole masses are restricted such
that the total binary mass is between 2 and 35 M_⊙. The maximum
allowed distance for the source systems is set to 20 Mpc
for BNS, 25 Mpc for NSBH and 60 Mpc for BBH. These
distances roughly correspond to the sensitivity range of the
detectors in this science run. All other parameters, including
the location of the source on the sky, are randomly sampled.
The simulated signals are distributed uniformly in distance. In
order to represent a realistic astrophysical population with a
probability density function scaling as distance squared, the
simulated signals are appropriately re-weighted and are counted
according to their weights. The simulated signals from each
set are injected into non-overlapping 2048-second blocks of
data and analyzed independently.

Analysis of the data is performed using the low-mass CBC
pipeline dEM! HI. It consists of several stages. First,
the data recorded by each interferometer are match-filtered with
the bank of non-spinning, post-Newtonian template waveforms
covering all possible binary mass combinations with
total mass in the range 2-35 M_⊙. The template waveforms
come from the same family as the simulated signal waveforms
previously described. When the SNR time series for
a particular template crosses the threshold of 5.5, a
single-interferometer trigger is recorded. This trigger is then
subjected to waveform consistency tests, followed by consistency
testing with triggers from the other interferometers. To be
promoted to a gravitational-wave candidate, a signal is required to
produce triggers with similar mass parameters in at least two
interferometers within a very short time window (set by the
light travel time between the detectors). The surviving
coincident triggers are ranked according to the combined effective
SNR statistic given by

$$ \rho_c^2 = \sum_{i=1}^{N} \rho_{\mathrm{eff},i}^2 , \tag{26} $$

where the sum is taken over the triggers from different detectors
that were identified to be in coincidence and the
phenomenologically constructed effective SNR for a trigger is
defined as

$$ \rho_{\mathrm{eff}}^2 = \frac{\rho^2}{\sqrt{\left(\frac{\chi^2}{2p - 2}\right)\left(1 + \frac{\rho^2}{r}\right)}} , \tag{27} $$

where ρ is the SNR, the phenomenological denominator factor
r = 250, and p is the number of bins used in the χ² test,
which is a measure of how well the signal in the data matches
the template [28]. In the denominator of Eq. (27), χ² is
normalized by 2p − 2, the number of degrees of freedom for this
test.
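
The effective SNR and its combination over coincident triggers can be sketched as follows. The code assumes the commonly used form ρ_eff² = ρ² / sqrt[(χ²/(2p − 2))(1 + ρ²/r)] with r = 250, and a combined statistic equal to the quadrature sum of the single-detector effective SNRs; the function names and the tuple layout of a trigger are hypothetical.

```python
import math


def effective_snr(snr, chisq, n_bins, r=250.0):
    """Effective SNR of a single-detector trigger.

    snr    : matched-filter SNR (rho)
    chisq  : value of the chi-squared waveform consistency test
    n_bins : number of bins p used in the test (2p - 2 degrees of freedom)
    r      : phenomenological denominator factor
    """
    if n_bins < 2:
        raise ValueError("need at least two chi-squared bins")
    if chisq <= 0.0:
        raise ValueError("chisq must be positive")
    reduced_chisq = chisq / (2 * n_bins - 2)
    # rho_eff^2 = rho^2 / sqrt(x)  is equivalent to  rho_eff = rho / x**0.25
    return snr / (reduced_chisq * (1.0 + snr ** 2 / r)) ** 0.25


def combined_effective_snr(triggers):
    """Quadrature sum over coincident triggers given as (snr, chisq, n_bins)."""
    return math.sqrt(sum(effective_snr(*t) ** 2 for t in triggers))
```

A trigger whose χ² greatly exceeds its number of degrees of freedom is down-weighted, which is how this statistic penalizes loud triggers that do not resemble the template.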

All steps in the analysis beyond calculation of the SNR, ρ,
are designed to remove non-Gaussian noise artifacts. Experience
has shown that if properly tuned, these extra steps significantly
reduce the number of false alarms [27]. Yet typically,
the resulting output of the analysis is still not completely free
of instrumental artifacts. Triggers that survive the pipeline's
initial tests include unsuppressed noise artifacts. The general
formalism developed in Section II can be applied to further
classify these triggers with the aim of optimally separating
signals from the noise artifacts. Each trigger is characterized by
a vector of parameters which, in addition to the combined
effective SNR, ρ_c, may include the chirp mass, M_c, the difference
in the times of arrival at different detectors, etc. Information
such as which detectors detected the signal and what the data
quality was at the time of detection can also be folded in as a
discrete trigger parameter. For such parametrized data, the
probability distributions in the presence and absence of a signal
can be estimated via direct Monte Carlo simulations. These
distributions, if estimated correctly, include a non-Gaussian
component. The triggers are ranked by their likelihood ratios,
Eq. (1), which results in an optimized search in the parameter
space of triggers.

Extra efficiency gained by additional processing of the triggers
depends strongly on the extent to which the non-Gaussian
features of the background noise are reflected in the distribution
of the trigger parameters. In the context of the search
for gravitational waves from compact binary coalescence in
LIGO data, the chirp mass of a trigger's template waveform
is one of the parameters that exhibits a non-trivial background
distribution. For a given M_c, the number of background triggers
falls off with increasing combined effective SNR, ρ_c, of
the trigger. The rate of falloff is slower for templates with
higher chirp mass, reflecting the fact that non-Gaussian noise
artifacts are more likely to generate a trigger for templates
with smaller bandwidth. Another important piece of information
about a trigger is the number and type of detectors that
produced it. Generally, detectors differ in their sensitivities
and levels of noise. In the case we are concerned with, two
detectors, H1 and L1, have comparable sensitivities, which are a
factor of two higher than the sensitivity of the smaller H2
detector. This configuration implies that signals within the
sensitivity range of the H2 detector are likely to be detected in
all three instruments, forming a set of triple triggers, H1H2L1.
Signals beyond the reach of the H2 detector can only be
detected in two instruments, forming a set of double triggers,
H1L1. Detection of a true signal by the other two-detector
combinations, H1H2 and H2L1, is very unlikely; therefore such
triggers are discarded in the search. The number density of
astrophysical sources grows as distance squared. As a consequence,
it is more likely that a gravitational-wave signal is
detected as an H1L1 double trigger. On the other hand, the
background of H1H2L1 triggers is much cleaner because
instrumental artifacts are less likely to occur in all three
detectors simultaneously. These competing factors should be
included in the ranking of the candidate events in order to
optimize the probability of detection.

It is natural to expect that inclusion of such information 
about the triggers in the ranking, in addition to the combined 
effective SNR, should help distinguish signals from noise
artifacts. The first step is to estimate the distribution of trigger
parameters for signals and background. For background esti- 
mation, we use the time-shifted data — the standard technique 
employed in the searches for transient gravitational-wave sig- 
nals in LIGO data Q3.E3HI131 ■ We perform 200 time shifts 
of the data recorded by L1 with respect to the data taken by
the H1 and H2 detectors. The time lags are multiples of 5
seconds. 

Analysis of time-shifted data provides us with a sample of
the background distribution of the combined effective SNRs
for H1L1 and H1H2L1 triggers with various chirp masses. We
find that all triggers can be subdivided into three chirp mass
bins: 0.87 < M_c/M_⊙ < 3.48, 3.48 < M_c/M_⊙ < 7.4,
and 7.4 < M_c/M_⊙ < 15.24. These correspond to equal-mass
binaries with total masses of 2-8 M_⊙, 8-17 M_⊙ and
17-35 M_⊙. These same bins were used in the analyses of the data
from LIGO's S5 and Virgo's VSR1 science runs [10, 13, 15].
Within each bin, the background distributions depend weakly
on chirp mass, thus there is no need for finer resolution. At the
same time, the distributions of the combined effective SNR in
different bins show progressively longer tails with increasing
chirp mass.

The distribution of triggers for gravitational-wave signals 
is simulated by injecting model waveforms into the data and 
analyzing them with the pipeline. This is done independently 
for each source type: BNS, NSBH and BBH. 

Following the prescription for optimal ranking outlined in
Section II, we treat each trigger as a vector of data c =
(ρ_c, a, m), where a denotes the type of the trigger, double
H1L1 or triple H1H2L1, and m is a discrete index labeling
the chirp mass bins. We construct the likelihood-ratio ranking,
Λ(ρ_c, a, m | S_j), for each binary type, where S_j stands for
BNS, NSBH or BBH. Note that the likelihood ratio has a strong
dependence on the binary type, S_j. To simplify calculations,
we approximate the likelihood ratio by

$$ \Lambda(\rho_c, a, m \mid S_j) \approx \frac{n_{\mathrm{inj}}^{j}(\rho_c, a, m)}{n_{\mathrm{slide}}(\rho_c, a, m)} , \tag{28} $$

where $n^{j}_{\mathrm{inj}}(\rho_c, \alpha, m)$ is the fraction of injected
signals of type $S_j$ that produce a trigger of type $\alpha$ with
$\rho'_c > \rho_c$ in the chirp mass bin $m$, and
$n_{\mathrm{slide}}(\rho_c, \alpha, m)$ is the fraction of time shifts of the
data that produce a trigger of type $\alpha$ with $\rho'_c > \rho_c$ in the
same chirp mass bin. This approximation is equivalent to using cumulative
probability distributions instead of probability densities. It is expected
to be reasonably good for the tails of probability distributions that fall
off as a power law or faster. The case we consider here falls in this
category.

FIG. 2. Efficiency in detecting BNS signals versus false-alarm rate
computed for various rankings. The solid curve corresponds to the
likelihood-ratio ranking, $\Lambda$, with uniform prior $p_s(S_j) = (1, 1, 1)$.
The dashed curve is the likelihood-ratio ranking, $\Lambda$, with the prior
$p_s(S_j) = (1, 0, 0)$, singling out BNS binaries for detection. The dotted
curve represents the standard search with the combined effective
SNR ranking, $\rho_{\mathrm{eff}}$.

FIG. 3. Efficiency in detecting NSBH signals versus false-alarm rate
computed for various rankings. The solid curve corresponds to the
likelihood-ratio ranking, $\Lambda$, with uniform prior $p_s(S_j) = (1, 1, 1)$.
The dashed curve is the likelihood-ratio ranking, $\Lambda$, with the prior
$p_s(S_j) = (0, 1, 0)$, singling out NSBH binaries for detection. The
dotted curve represents the standard search with the combined effective
SNR ranking, $\rho_{\mathrm{eff}}$.
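The cumulative-fraction approximation of Eq. (28) can be illustrated with a minimal Python sketch. It is not the pipeline's implementation: the function names, the dictionary layout keyed by (trigger type, chirp mass bin), the toy SNR values, and the convention used when no background trigger is louder than the candidate are all assumptions made for illustration.

```python
import math
from bisect import bisect_right

def tail_fraction(sorted_rho, rho_c, n_trials):
    """Fraction of n_trials that produced a trigger with rho' > rho_c.

    sorted_rho must be sorted in ascending order.
    """
    n_louder = len(sorted_rho) - bisect_right(sorted_rho, rho_c)
    return n_louder / n_trials

def approx_lambda(rho_c, alpha, m, inj, slide, n_inj_total, n_slide_total):
    """Ratio of cumulative fractions for one source type S_j, as in Eq. (28).

    inj and slide map (alpha, m) -> ascending list of combined SNRs.
    """
    num = tail_fraction(inj.get((alpha, m), []), rho_c, n_inj_total)
    den = tail_fraction(slide.get((alpha, m), []), rho_c, n_slide_total)
    if den == 0.0:
        # Assumed convention: no louder background trigger was observed.
        return math.inf if num > 0.0 else 0.0
    return num / den

# Toy data (hypothetical): triggers from 10 BNS injections and 100 time shifts.
inj_bns = {("H1L1", 1): [8.0, 9.5, 11.0, 12.5], ("H1H2L1", 1): [10.0, 13.0]}
slides = {("H1L1", 1): [8.2, 8.9, 9.1, 10.4], ("H1H2L1", 1): [8.5]}

lam = approx_lambda(9.0, "H1L1", 1, inj_bns, slides,
                    n_inj_total=10, n_slide_total=100)
print(lam)
```

In practice the background tail is sparsely sampled, so the zero-denominator case needs more careful treatment (for example extrapolation of the tail) than the placeholder convention used here.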

We apply the new ranking statistic given by Eq. (28) to
all triggers: background and signals. Each trigger has three
likelihood ratios, one for each binary type. We introduce a
prior distribution for binary types, $p_s(S_j)$. It can encode either
our knowledge about astrophysical populations of binaries
or the relative "importance" of different types of binaries
to the search. In what follows we consider four alternatives:
$p_s(S_j) = (1,0,0)$, $p_s(S_j) = (0,1,0)$, $p_s(S_j) = (0,0,1)$ and
$p_s(S_j) = (1,1,1)$. The first three single out one of the binary
types, whereas the last one treats all binaries on an equal footing.
Finally, the ranking statistic is defined as

$$\Lambda(\rho_c, \alpha, m) = \max_{S_j} \Lambda(\rho_c, \alpha, m \mid S_j)\, p_s(S_j) . \qquad (29)$$

The four alternative choices for $p_s(S_j)$ define four searches.
For example, $p_s(S_j) = (1, 0, 0)$ corresponds to the search targeting
only gravitational-wave signals from BNS coalescence.
Similarly, $p_s(S_j) = (0, 1, 0)$ and $p_s(S_j) = (0, 0, 1)$ define the
searches for gravitational waves from NSBH and BBH. The
uniform prior, $p_s(S_j) = (1, 1, 1)$, allows one to detect all signals
without giving priority to one type over the others. In
each of the searches, the likelihood-ratio ranking, Eq. (29),
re-weights triggers, giving higher priority to those that are likely
to be the targeted signal as opposed to noise.
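The maximization in Eq. (29) can be sketched in a few lines of Python. The per-type likelihood ratios below are made-up numbers, and skipping source types with zero prior (to avoid multiplying an infinite ratio by zero) is an implementation choice assumed here, not something stated in the text.

```python
def ranking_statistic(lambdas, prior):
    """Eq. (29): maximum over source types S_j of Lambda(c | S_j) * p_s(S_j).

    lambdas and prior are dicts keyed by source type. Types with zero
    prior are skipped so that inf * 0 never appears (assumed convention).
    """
    weighted = [lambdas[s] * prior[s] for s in lambdas if prior[s] > 0]
    return max(weighted) if weighted else 0.0

# Hypothetical likelihood ratios of a single trigger for each binary type.
lambdas = {"BNS": 15.0, "NSBH": 4.0, "BBH": 0.5}

uniform = {"BNS": 1, "NSBH": 1, "BBH": 1}   # p_s = (1, 1, 1)
bbh_only = {"BNS": 0, "NSBH": 0, "BBH": 1}  # p_s = (0, 0, 1)

rank_uniform = ranking_statistic(lambdas, uniform)
rank_bbh = ranking_statistic(lambdas, bbh_only)
print(rank_uniform, rank_bbh)
```

The same trigger therefore receives a different rank in each of the four searches, which is how a targeted prior suppresses triggers that are unlikely to be the targeted signal.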

In order to assess the improvement attained by the new ranking,
we compute the efficiency in recovering simulated signals
from the data as a function of the rate of false alarms. For
a given rate of false alarms we find the corresponding value
of the ranking and define efficiency as the ratio of the number of
injected signals ranked above this value to the total number of signals
that passed the initial cuts of the analysis pipeline. This is equivalent
to computing the standard receiver operating characteristic curve,
$P_1(P_0)$, defined earlier. The efficiency curves are computed for
BNS, NSBH and BBH binaries. In each case we evaluate the efficiency
of both likelihood-ratio rankings, the one that targets
only that type of binary and the one that applies the uniform
prior, $p_s(S_j) = (1, 1, 1)$. We compare the resulting curves
to the efficiency curve for the standard analysis pipeline that
uses the combined effective SNR, $\rho_c$, as the ranking statistic.
These curves are shown in Figures 2-4.

FIG. 4. Efficiency in detecting BBH signals versus false-alarm rate
computed for various rankings. The solid curve corresponds to the
likelihood-ratio ranking, $\Lambda$, with uniform prior $p_s(S_j) = (1, 1, 1)$.
The dashed curve is the likelihood-ratio ranking, $\Lambda$, with the prior
$p_s(S_j) = (0, 0, 1)$, singling out BBH binaries for detection. The
dotted curve represents the standard search with the combined effective
SNR ranking, $\rho_{\mathrm{eff}}$.
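The efficiency-versus-false-alarm-rate computation described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the toy ranking values, and the use of a single background livetime (in years) to convert a false-alarm rate into an allowed number of background events are hypothetical, not taken from the analysis pipeline.

```python
import math

def efficiency_at_far(far, background_ranks, injection_ranks, livetime_yr):
    """Fraction of injections ranked above the threshold set by `far`.

    far             -- false-alarm rate in events per year
    background_ranks -- ranking values of time-shifted (noise) triggers
    injection_ranks  -- ranking values of injections that passed initial cuts
    livetime_yr      -- background livetime in years (assumed known)
    """
    n_allowed = int(far * livetime_yr)  # background events allowed above threshold
    ranked = sorted(background_ranks, reverse=True)
    if n_allowed >= len(ranked):
        threshold = -math.inf
    else:
        # At most n_allowed background triggers lie strictly above this value.
        threshold = ranked[n_allowed]
    found = sum(1 for r in injection_ranks if r > threshold)
    return found / len(injection_ranks)

# Toy ranking values (hypothetical), with 4 years of background livetime.
background = [2.0, 9.5, 3.1, 7.2, 5.0, 1.0, 6.4, 4.4]
injections = [10.0, 6.4, 7.0, 3.0, 8.1]

curve = [(far, efficiency_at_far(far, background, injections, 4.0))
         for far in (0.0, 0.5, 2.0)]
print(curve)
```

Evaluating the same function with different ranking statistics (the combined effective SNR or either likelihood-ratio ranking) yields curves that can be compared at a fixed false-alarm rate.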

They reveal that the searches targeting a single type of binary,
represented by the dashed curves, are more sensitive than the
uniform search, the solid curve. This is expected, because
narrowing down the space of signals typically allows one to
discard the triggers that mismatch the signal's parameters,
reducing the rate of false alarms without loss of efficiency in
recovering these signals. For instance, the search targeting only BNS
signals discards all triggers from the high chirp mass bins,
$m = 2, 3$, without discarding the BNS signals. This reduces
the rate of false alarms, although at the price of missing possible
gravitational-wave signals from other types of binaries,
NSBH and BBH. Still, one could justify such a search if it were
known that NSBH and BBH binaries do not exist or are very
rare. The uniform search, despite being less sensitive to BNS
signals, allows one to detect the signals from all kinds of binaries.
Such a search still gains in efficiency over the standard
search, the dotted curve, for BNS and NSBH systems (Figures 2 and 3).
At the same time, Figure 4 shows that it does worse than the
standard search in detecting BBH signals. This is an unavoidable
consequence of the re-weighting of triggers by the likelihood ratio,
Eq. (29), based on their type and chirp mass. It ranks triggers from
the lower chirp mass bins higher, because these triggers are less likely
to be a noise artifact. This leads to some loss of sensitivity
to BBH signals, but gains sensitivity to BNS and NSBH signals.
The role of the likelihood ratio is to provide the optimal
re-weighting of triggers that results in the highest overall efficiency.
In the case of the uniform search, it should provide an
increase in the total number of detected sources of all types. To
demonstrate that this is indeed the case, we plot the combined
efficiency of the uniform search for BNS, NSBH and
BBH signals and compare it to the efficiency of the standard
search in Figure 5.

The combined efficiency of the uniform search in Figure 5
is higher than that of the standard search because triggers
are re-weighted by the likelihood ratio, which properly accounts
for the probability distributions of noise and signals.
To gain further insight into this process we pick a particular
point on the efficiency curve that corresponds to a rate of
false alarms (the x-axis) of 1.25 events per year. We find the
threshold on the combined effective SNR corresponding to this rate
in the standard search to be $\rho^* = 11.34$. Next, we find the
corresponding threshold for the logarithm of the likelihood ratio,
$\ln \Lambda^*(\rho_c, \alpha, m) = 9.11$. For each $(\alpha, m)$ combination
this value can be mapped to $\rho_c$, which will be different for each
type of trigger. Both $\rho^* = 11.34$ and $\ln \Lambda^*(\rho_c, \alpha, m) = 9.11$
define detection surfaces in the $(\rho_c, \alpha, m)$ space of trigger
parameters. We depict them in Figure 6.

FIG. 5. Efficiency in detecting signals from any binary (BNS, NSBH
or BBH) versus the false-alarm rate computed for various rankings.
The solid curve corresponds to the likelihood-ratio ranking, $\Lambda$, with
uniform prior $p_s(S_j) = (1, 1, 1)$. The dotted curve represents the
standard search with the combined effective SNR ranking, $\rho_{\mathrm{eff}}$.

The signals falling to the right of $\rho^* = 11.34$, the dashed
line, are considered to be detected in the standard search. Similarly,
the signals that happen to produce a trigger to the right
of $\ln \Lambda^*(\rho_c, \alpha, m) = 9.11$, the solid line, are considered to
be detected in the uniform search with the likelihood-ratio
ranking. The line of constant likelihood-ratio ranking sets different
thresholds for the combined effective SNR, $\rho_c$, of the triggers
depending on their type. The threshold is higher than
$\rho^* = 11.34$ for the H1L1 triggers from the third chirp mass
bin. The signals producing triggers in the shaded area in this
bin, labeled by (-), are missed in the uniform search but detected
by the standard search. These signals typically correspond
to BBH coalescence. The effect of this is visible in
Figure 4: the solid curve is below the dotted curve at a false
alarm rate of 1.25 events per year. On the other hand, the
thresholds for other trigger types and chirp masses are lower
than $\rho^* = 11.34$. As a result, the signals producing triggers
with parameters in the shaded regions labeled by (+) are detected
in the uniform search but missed by the standard one.
The net gain from detecting these signals is positive (Figure 5).
The process of optimization of the search in the $(\rho_c, \alpha, m)$
parameter space can be thought of as a deformation of the detection
curve, $\rho^* = 11.34$, with the aim of maximizing the efficiency
of the search. The deformations are constrained to those that
do not change the rate of false alarms. The optimal detection
surface, as was shown in Section II B, is the surface
of constant likelihood ratio. This is the essence of the
likelihood-ratio method.
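The mapping from a single likelihood-ratio threshold to a different SNR threshold in each (trigger type, chirp mass bin) can be sketched as below. The linear models for the log likelihood ratio are invented purely for illustration and are not the distributions measured in the simulation; only the threshold value 9.11 is taken from the text. The sketch also assumes that the log likelihood ratio grows monotonically with the combined SNR within a bin.

```python
def snr_threshold(log_lambda, log_lambda_star, rho_grid):
    """Smallest rho_c on an ascending grid with ln Lambda >= log_lambda_star.

    log_lambda is a callable rho_c -> ln Lambda for one (alpha, m) bin.
    Returns None if the threshold is not reached anywhere on the grid.
    """
    for rho in rho_grid:
        if log_lambda(rho) >= log_lambda_star:
            return rho
    return None

rho_grid = [8.0 + 0.25 * i for i in range(33)]  # 8.0, 8.25, ..., 16.0

# Hypothetical per-bin models of ln Lambda as a function of rho_c.
toy_models = {
    ("H1L1", 3): lambda rho: 1.5 * (rho - 6.0),
    ("H1H2L1", 1): lambda rho: 2.0 * (rho - 5.0),
}

# Detection surface: one SNR threshold per (trigger type, chirp mass bin).
surface = {key: snr_threshold(f, 9.11, rho_grid)
           for key, f in toy_models.items()}
print(surface)
```

Bins whose threshold lies above the single SNR cut of the standard search lose signals relative to it, while bins whose threshold lies below that cut gain signals, which is the deformation of the detection curve described in the text.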

The power of the likelihood-ratio ranking depends strongly
on the input data. For demonstration purposes, in the simulation
we restricted our attention to a subset of trigger
parameters, $(\rho_c, \alpha, m)$. We expect that inclusion of other
parameters, such as the difference in arrival times of the signal at
different detectors, ratios of recovered amplitudes, etc., should
drastically improve the search. We leave this to future work.

FIG. 6. The detection surfaces for the combined effective SNR, $\rho_c$,
and the likelihood-ratio, $\Lambda$, rankings at the false-alarm rate of 1.25
events per year. The y-axis labels different types and chirp mass
bins of the triggers. The dashed line is the line of constant combined
effective SNR, $\rho^* = 11.34$. The solid line is the line of constant
likelihood ratio, $\ln \Lambda^*(\rho_c, \alpha, m) = 9.11$. A signal producing a
trigger that falls to the right of the dashed/solid curve is considered
to be detected in the search with the combined effective SNR/likelihood-ratio
ranking. Those triggers that fall to the left are missed. The
shaded region is the difference between the detection regions for the
likelihood-ratio and the combined effective SNR rankings. The signals
that produce a trigger with parameters in the shaded regions labeled
by (+) are gained in the search equipped with the likelihood ratio
but missed by the search with the combined effective SNR ranking.
Those signals that produce a trigger in the shaded region labeled by
(-) are missed by the likelihood-ratio ranking but detected by the
combined effective SNR ranking.

IV. CONCLUSION 

In this paper, we describe a general framework for designing
optimal searches for transient gravitational-wave signals
in data with non-Gaussian background noise. The principal
quantity used in this method is the likelihood ratio, the ratio
of the likelihood that the observed data contain a signal to the
likelihood that the data contain only noise. In Section II we
prove that the likelihood ratio leads to the optimal analysis
of data, incorporating all available information. It is robust
against an increase in the data volume, effectively ignoring
irrelevant information. We apply the general formalism to two
typical problems that arise in searches for gravitational-wave
signals in LIGO data.

First, in Section III A, we show that when searching for
gravitational-wave signals in the data from different experiments
or detector configurations, it is necessary as well as sufficient
to rank candidates by the "local" likelihood ratio given
by Eq. (22), which is calculated using estimated local probabilities.
This provides overall optimality across the experiments.
Candidate events from different experiments can be
compared directly in terms of their likelihood ratios. This results
in complete unification of the data analysis products. Another
significant feature of the unified analysis is that the
candidate's significance is independent of the duration of the
experiments. Only the detectors' sensitivities and the level of
background noise contribute to the likelihood ratio of the candidates.
The experiment's duration, on the other hand, measures
its contribution to the total probability of detecting a signal (or
efficiency) and the total probability of a false alarm.

Second, in Section III B, we aim to improve the efficiency of
the search for gravitational waves from compact binary coalescence
by considering the issue of consistent accounting for
non-Gaussian features of the noise in the analysis. We suggest
a practical solution to this problem: estimate the probability
distributions of the parameters of the candidate events (e.g., the SNR
and the chirp mass of the template waveform, the type of trigger,
etc.) in the presence and absence of a signal in the data, construct
the likelihood ratio that includes non-Gaussian features,
and use it to re-rank candidate events. Non-trivial information
contained in the probability distributions of the candidates'
parameters allows for a closer-to-optimal evaluation of their significance.
Indeed, as we demonstrate in the simulation, inclusion of the
chirp mass and the type of trigger in the likelihood-ratio ranking
results in a significant increase of efficiency in detecting
signals from coalescing binaries.

We would like to stress that the approach described in this
paper is quite generic and can be applied to a wide range of
problems in the analysis of data with non-Gaussian background.
Its main advantage is a consistent account of the statistical
information contained in the data. It provides a unified measure, in
the form of the likelihood ratio, of the information relevant to
detection of the signal in any type of data. This allows one to
combine data of very different kinds, such as the type of experiment,
the type of trigger, its discrete and continuous parameters,
etc., into a single optimized analysis.